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Abstract 

A transmitted, unknown radar signal is observed at the receiver through more than one path in additive 
noise. The aim is to recover the waveform of the intercepted signal and to simultaneously estimate 
the direction of arrival (DOA). We propose an approach exploiting the parsimonious time-frequency 
' representation of the signal by applying a new OMP-type algorithm for structured sparsity patterns. An 

,-Ch ! important issue is the scalability of the proposed algorithm since high-dimensional models shall be used 

for radar signals. Monte-Carlo simulations for modulated signals illustrate the good performance of the 
method even for low signal-to-noise ratios and a gain of 20 dB for the DOA estimation compared to 
some elementary method. 
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I. Introduction 



The aim of SIGnals INTelligence (SIGINT) is to intercept as much information as possible on received 
signals. For radar sources, the parameters of interest are the Direction Of Arrival (DOA) of the direct 
path (assuming it is observed), but also, if possible, the carrier frequency and the modulation scheme. 
In practice, due to the presence of multipath propagation, the interception of radar sources remains a 
difficult problem. Indeed, multipath propagation creates nuisance parameters that have to be taken into 
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account such as DOAs and relative Times of Arrival (TOA) of reflected paths, relative signal power levels 
on the different paths and additive noise variance. 

In this paper, we shall consider a unique narrowband farfield source propagating through several paths, 
corrupted with additive Gaussian white noise and observed on a multi-sensor array. This issue is actually 
very involved since the observed signals on the different sensors are highly correlated and the different 
replicates are likely to overlap. This prevents us from using methods such as MUSIC, ESPRIT or MVDR 
which do not work very well with coherent signals [1]. These approaches, based on subspace methods 
or on maximum likelihood estimation, have been extended by 0, 0] and [5] to deal with coherent 
sources. However, to the best of our knowledge, estimating the waveform, which is an infinite-dimensional 
parameter, in a nonparametric way has never been considered for multipath signals. To solve this problem, 
our main idea consists in exploiting the sparsity of the time-frequency representation of the waveform, 
which means that the selection of possibly useful signal components from a huge collection of candidate 
signals may be driven by a sparsity constraint. Such kind of approach was initially proposed by [6| but 
only for the estimation of the time-delays, which is a finite dimensional parameter, and is extended here 
for estimating the DOAs and TOAs as well as the waveform. 

For dealing with the estimation in sparse linear regression models, the Lasso and the greedy 
orthogonal matching pursuit (OMP) ( JSJ, 0) have become very popular tools. In our case, an inspection 
of our model shows that the parameter vector is not sparse in an arbitrary way: its sparsity pattern has 
a specific structure. In order to include prior information concerning the sparsity structure, different 
approaches have recently been proposed. On the one hand, there are methods based on composed 
^1/^2-penalties (elastic nets ifTOlk fused Lasso ifTTl : group Lasso |[l2lk composite absolute penalty ITT3) : 
overlapping groups lTT4l . [15]), on the other hand, |H6] proposed the group-OMP method for non- 
overlapping groups, which is an extension of the orthogonal matching pursuit to structured solutions. 

In this paper we explain how to extend the group-OMP to overlapping groups and how to deal with 
the scalability of our algorithm which is a crucial issue in the context of intercepted radar signals since 
the model dimension is usually very high. A simulation study shows that the proposed method performs 
well for composed and modulated signals even when the signal-to-noise ratio is low. A gain of more 
than 20 dB for the DOA estimation is achieved in comparison with some elementary method. 

The paper is organized as follows. In Section [TFJ the mathematical model for radar signals is introduced, 
together with a reformulation of the problem in the form of a sparse, partly linear model. In Section 
imi an OMP-type algorithm is developed which is well adapted to our model. Section [TV] deals with 
the scalability issue of the algorithm, while Section [V] presents an appropriate stopping criterion for the 
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OMP-type algorithm. Finally, the experimental results of Section [VI] illustrate the performance of the 
new algorithm and a comparison with some elementary method is provided. Conclusions are made in 
the final Section IW] 

II. Model 

A. Signal Modelling 

Let us denote by so(t) the original radar signal emitted by the source, where t is the time. Due to 
propagation and reflexion on obstacles, the signal arrives via several pathways on the receiver. With each 
pathway we associate a direction of arrival, say D u , a time-delay, say T u and a complex constant A u 
representing the attenuation of the signal on the u-th pathway. The signal is detected by an antenna with 
C sensors whose array response is denoted by a(-). Then the signal arriving at the sensor array at time 
t is a C-dimensional vector given by 

u 

s(t) = Y, A uS (t-T u )a(D u ) , (1) 

u=l 

where U denotes the number of propagation pathways, which is unknown. As the time lags T u are 
typically much shorter than the total length of the signal sq, the arriving signal s is the superposition of 
several delayed replicates sq with different amplitudes. 

In practice, the signal s is observed with some additional noise, say e, and at a sample frequency, say 
F s . With A s = 1/F S the observations are hence given by 

y m = s(mA s ) + e m , m = l,...,M , (2) 

where e m 6 C c is a noise vector. Generally one considers circular gaussian independent random vectors 
with zero mean and covariance a 2 Ic- 

B. Linear Regression Model 

In practice, all model parameters are unknown, except the array response function a(-), which is 
determined by the array geometry or by calibration. As the relationship between the parameters is rather 
involved, we propose to reformulate the estimation problem by linearizing the model. This is obtained, 
on the one hand, by representing the waveform in an overcomplete basis and, on the other hand, by 
discretizing the parameter spaces of the other parameters. This leads to a linear model with a high- 
dimensional parameter vector and nonlinear constraints. As this vector is structured and sparse, i.e. most 
entries are zero, regularization methods with structured sparsity inducing norms can be applied [13] . 
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1 ) Decomposition of the signal : To estimate the waveform, the radar signal sq is represented in some 

given (overcomplete) basis or dictionary V = {ifj,j = 1, . . . , J}, such that 

J 

s o = ^2Pj ( Pj: (3) 

3=1 

with appropriate coefficients f3j G C. Hence, estimating sq becomes recovering the coefficients j3j. When 
using appropriate, large dictionaries, a small number of elements (fj is sufficient to reconstruct sq. In 
other words, it is assumed that the dictionary is such that most coefficients /3j are zero and hence that 
the representation of so is sparse. 

2) Propagation model: To estimate the parameters A u , D u and T u by discretization, we introduce 
grids {n, . . . , Tp} and {9%, . . . , 9q} of potential values of the delay times T u and the angles of arrival 

D u respectively, as in J6l, ifTTl . Then s(t) can be rewritten as 

P Q 

s(t) = 5^5^ a P><? a (t - Tp) a(0 q ) , (4) 

p=l q=l 
U 

with a Pi q = ^ A uH T u = t p , D u = 9 q } , 

u=l 

where 1 is the indicator function. Clearly, there are only U coefficients a Pi q that are nonzero. Hence, 
when fine grids are used, i.e. P and Q are large, this is another sparse estimation problem. As the number 
of pathways U is generally unknown, model selection is required, i.e. the estimation of the model order 
U. 

By combining (0 and (01), the observation y m reads 

J P Q 

ym = ^2^2^2 a P^3 C Pj ( mA * ~ T p) V-{6q) + 4 , (5) 
3=1 P =1 Q= l 

for m = 1 , . . . , M. It is convenient to represent the model in matrix notation. Let X be the matrix of 
size CM x JPQ, where all known quantities are stored. More precisely, the (J,p,q)-th predictor, i.e. 
the column Xj jPtq of X, is given by 

/ tpj(A s -T p )a(9 q ) \ 



D 3.P.<? 



€ C 



CM 



\ (fj(M A s - T p ) 3(9 q ) J 

In other words, Xj iPiq corresponds to the signal received by the antenna (without noise) when the emitted 
signal is the element (fj arriving with delay t p and angle 9 q . Denote the observation vector by Y = 
(yf, . . . ,Vm) T G C CM and the noise vector by e = (e[ , . . . ,£^) T G C . Finally, let the Kronecker 
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product /3<g)a of the vectors (3 = (f3i, . . . , (3j) T and a = (ai,i, • • • ,ap,Q) T be the JPQ-vector u> defined 
as 



w 



w 



V 



/3l«l,2 



(6) 



Then model © can be written as 

Y = X(/3 ® a) + e with a G C PQ , /3 G C J . (7) 
Finally, introduce the following set 

W = {w G C JPQ : w = 13 ® q, a G C PQ , /3 G C 7 } , 

which is a subspace of C JP ® that can be parametrized using J + PQ variables. Remark that, given a 
vector w G W, one can recover the vectors a and j3 only up to a multiplicative constant, which does not 
matter in our application. Thus, model d5) can be written as the following linear regression model 



Y = Xt« + e with w G W, 



(8) 



where the design matrix X is known and the parameter vector w is to be estimated from the observations 
Y. The particularity of this regression model is that the search space for w is not the entire space 

C JPQ 

but the smaller, nonconvex set W. 



III. Estimation procedure 

The problem now is to estimate the vectors a and (3 by fitting model (Q, or equivalently, to estimate 
w in model ([8]). The advantage of ([8]) over |7]) is the linearity of the model. Since the dimension of the 
space W is much higher than that of the observations Y, the minimization problem 



min \\Y — Xwll 2 

wew 



(9) 



is ill-posed. However, we note that w must be a sparse vector, since a and j3 are assumed sparse. It is 
well known that sparsity leads to stable estimation procedures based on a regularized fitting criterion with 
an £i-penalty (Lasso 0; basis pursuit ifTBl : Lars |[T9l ) or greedy algorithms (as matching pursuit and 
orthogonal matching pursuit (OMP) O, COT ). A closer inspection of W reveals some structure in the 
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sparsity patterns, that means, there are constraints on the distribution of the zero entries for all vectors 
w in W. More precisely, the set of indices S = {(j,p,q) ■ such that Wj tM = 0} is called the sparsity 
pattern of some given vector w. Indeed, all elements w of W verify that a p > jq > = implies that the 
components Wj tP > q > = for all j, and likewise, if /3j> = then Wj> jPjq = for all p and q. Thus, w 
must not be sparse in an arbitrary way, but its sparsity pattern has to respect some structure. Modern 
estimation procedures force specific structures of the sparse vector. On the one hand, there are methods 
based on composed £\/£2 -penalties (elastic nets ifTOlk fused Lasso iPTTTk group Lasso Ifl2) : composite 
absolute penalty Ifl3l ; overlapping groups lTT4l . |fT51 ), on the other hand, the orthogonal matching pursuit 
can be extended to structured solutions (group-OMP for non-overlapping groups lfl6*l ). 
Our problem can be related to a penalized minimization problem of the form 

min U\Y - X.wf + n(w)\ , (10) 

with a structured £\j£2 -penalty Q(w) (detailed below) in the sense that the sparsity pattern of a solution 
of this problem, say w, respects the required structure. However, there is no guarantee that the nonzero 
components of w satisfy relation ©, that is, that there are coefficients a p>q and (3j such that Wj ;P;9 = 
a Pi q/3j and hence w may not belong to W. The problem lies in the nonzero components of w. To compute 
the solution of (fTOl . different solutions have been proposed in the literature as the active set algorithm in 
in |[2ll and |[T5l . In this paper we propose an extension of the group-OMP |[T6l for overlapping groups. 
Now, to fit the model given in ([7]) we propose the following procedure consisting of two steps, that are 
developed below in more detail. 

Step 1. (Model selection) Solve problem (fTOl via an OMP-type algorithm to obtain a solution w with 
admissible sparsity structure. Identify the nonzero components Wj :P: q and the indices A of the associated 
coefficients a Pi9 and fij that must be nonzero. 

Step 2. (Estimation in reduced dimension) Reduce the dimension of the regression matrix X by keeping 
the predictor Xj jPjq only if Wj iPi q ^ 0. Then compute the least squares estimator in model (0 by the 
Nelder-Mead simplex method. This problem is now well-posed because of the largely reduced dimension. 

A. Model selection step 

To specify the penalty Q(w) in (fTOl . we describe the structure of the sparsity pattern S of w € W. 
To this end, for a given couple (p,q), we denote the set of indices G pq = {{j,p,q), j 6 {1, • • • ,«/}}, 
and likewise, for a given j, the set of indices Gj = {(j,p, q), p G {1, . . . , P}, q E {1, ... , Q}}- It is 
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clear that the sparsity pattern S of any w in W can be written as the union of all sets Gp q and Gj with 

a p , q = and /3j = 0. 

Given a set of indices G, let ||u>g||2 denote the ^ 2 - norm of the subvector of w composed of the elements 
w j,p,q w i tn tii q) ^ G. Then one can show that the solution, say w, of (fTOl) with the following penalty 

n(w) = Ai \\ w g«J\2 + \\v)G$h > (H) 

p.? i 

where Ai, A2 > are regularization parameters, has the required sparsity structure, i.e. the sparsity pattern 
of w can be written as the union of sets Gp" q and Gj. Note that these groups are possibly overlapping. 
Indeed, the penalty can be viewed as a mixture of t\- and ^-norms appropriate for overlapping group 
structures. More precisely, the penalty is the sum (i.e. the £i-norm) of the ^-norm of subvectors wq- 
Likewise to the standard Lasso, the £i-norm entails that for several groups G the terms H^clb are zero, 
implying in turn that all entries of wq are exactly zero. By using two regularization parameters Ai > 
and A2 > one can use different degrees of sparsity for the vectors a and /3. 

To solve the problem (TTOb with penalty Q(w) given by (fTTT) we propose an OMP-type algorithm. To 
describe the algorithm we introduce the collection Q = {G g ,g G 1} of all sets Gp q and Gj indexed by 
a common index g with index set 1. Note that the cardinality of T is PQ + J. For a set of indices A, 
denote by the matrix made of the corresponding predictors x^ v ^ q of X such that (j, p, q) S A. 

Recall that the principle of OMP consists in adding iteratively the predictor to the current solution 
which is the most correlated with the current residual = Y — Xu?w. In |[T6ll this principle is extended 
to selecting non-overlapping groups of variables, where Q is a set of pairwise disjoint sets G g and one 
adds all variables of group G g * if ||X^ ( r'*' = max ge x ||X^ r' { ' |||. Proceeding in this way guarantees 
that at every step of the algorithm, the current solution respects the required sparsity structure, which 
means that the sparsity pattern of the current solution 117W equals the union of some groups G g G Q. 

Now this procedure can be extended to our context with overlapping group structures, where the aim 
remains the same: the sparsity pattern of every intermediate solution must be the union of some 
groups G g G Q. The difference to the group-OMP of |[T6l is at the level of the selection of variables 
that may enter the solution. Indeed, due to the overlapping group structure, the sets of variables that 
may be activated at the next step are not simply the groups G g G Q. To be more precise on this issue, 
we introduce the set c I of indices of zero groups G g associated with the current solution ujW, 
such that = \J geZ {t)G g is the sparsity pattern of Clearly, the indices of the non zero entries 
of 10 w are given by A^' = (S^) c , where A c denotes the complementary set of A with respect to Q. 
Now the candidate groups of components that may be activated at the next step are obtained by deleting 
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one of the indices from the zero groups Z®. That means, every potential group of indices has the form 
Vg Q = (U gl zzw\{g }G 'g) c \A^' where go G Z^\ Then, following the philosophy of OMP, one activates 
the components given by V g * if ||X^ »r^||| = max go ^zw ||X^ ^lli- Once a group V g - is selected, 
one derives the new set of active indices ,4.(* +1 ) = A^ U V g * and updates the current solution by solving 
the least squares problem of reduced dimension vj^^X) = argmin„, \\Y — X^ct+ij w\^. 

We notice that the algorithm can be viewed as a walk through a directed acyclic graph, which also 
appears in the algorithm proposed in lTT5l . Formally, our OMP-type algorithm can be described as follows. 

Algorithm 1 (OMP-type algorithm for overlapping groups): Initialization. Set t = 0. 
Repeat until some stopping criterion is satisfied. 
I f t = (initial step) 

Compute (j*,p*,q*) = arg max \\xjj P(7 )^|||- 

Denote by g* a and the group indices associated with 

the groups Gp» q * and G\-», respectively 

Put the set of zero group indices Z^> = X\{<7*, g^}. 

Put the set of active indices A^ = {(J*,p*, q*)}. Else 

For each go E Z^> set 

V go = (u ge2W \ {go} G g ) c \A^ . 

Compute g* = arg max |[(X-p 3 ) T r^|||. 

g ez (t) g ° 

Update = Z®\{g*} and A {t+l "> = A® U V g ~. 

End 

Update the current solution 

Wju^v, =arg min \\Y - X^+^Hli ■ ( 12 ) 

u, e CI^ (t+1) l 

Update the residual A t+ ^ = Y — X Ai t + i ) w ( - t+1 ' ) . 
Set t = t + 1. 

End 

Note that in any case the algorithm stops after a finite number of iterations, more exactly after J+PQ—l 
steps. Then all groups are activated and the solution w is the ordinary least squares estimator. The 
important issue of an appropriate stopping criterion for the algorithm is addressed in Section [V] Clearly, 
the aim is to stop the algorithm earlier to provide a relevant, sparse solution. 
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We remark that the proposed algorithm is a natural extension of the orthogonal matching pursuit. 
Moreover it is similar to the active set algorithm described in lfl5l . which provides an interesting 
mathematical analysis of the optimization problem given by ( fTOl ). Nevertheless, our OMP-type algorithm 
provides an approach which is feasible in practice and computationally faster than the active set algorithm. 
Indeed, in (fl2l ) the solution is updated by the ordinary least squares estimator that is known explicitly, 
whereas Ifl5l proposes to solve the penalized problem of reduced dimension by a more expensive second- 
order cone programming. 

B. Solution of the reduced problem 

The solution of (fTOl) is a sparse vector, say w, whose sparsity pattern serves to identify the indices of the 
nonzero coefficients a M and /3j. More precisely, if Wj tPi q 7^ 0, this implies that the associated coefficients 
a Pi g and (3j are both non zero. Conversely, Wj tPtq = implies that at least one of the coefficients is 0. 

To obtain estimates of these nonzero coefficients, we reduce the dimension of the regression matrix 
X by keeping the predictor Xj. p ^ q only if Wj tPtq 7^ 0. Denote the resulting matrix by X re d- Then compute 
the least squares estimator by minimizing 

L(aJ) = ||y-X red (/3 0a)|| 2 , (13) 

where a and j3 are the associated a- and /3-vectors of reduced dimension PQ and J, respectively This 
minimization is a well-posed problem as the dimension of X rec i is much smaller than that of X. 

To compute the solution of (TT3l one can use the fact that the model is linear in a for fixed f3 and 
conversely. That means for fixed (3 we define the matrix X(/3) with PQ columns by 

J 

X red (/3) = ^/3,m i ,...,^). 

i=i 

Then the minimum of a 1— > L(a, /?) is the ordinary least squares estimator given by 

a 0LS (/3) = (X red (/3) T X red (/3))- 1 X red (/3) T y . 

Furthermore, as a and j3 are identifiable only up to some multiplicative constant, one can set f3\ = 1, 
for instance. Finally, it remains to minimize 

02,..., h) ^ L (« OLS ((i, h,..., /3j) T ), a, h, ■ ■ ■ , hf) 

which can be done by the Nelder-Mead simplex method. This is feasible since the number of unknown 
variables, i.e. J — 1, is quite low. 
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IV. Scalability 

To obtain a good representation of the waveform, a large dictionary shall be used. Likewise, to avoid 
biased estimators of angles and time delays, we may use fine grids. However, large dictionary sizes J 
and large grid sizes PQ entail a huge regression matrix X having JPQ columns (see Section |VT1 where 
JPQ fa 6.2 10 9 ). This raises computational difficulties, namely concerning the storage of the regression 
matrix X. 

A closer look at the OMP-type algorithm shows that there are no computations involving the entire 
regression matrix. Essentially, the algorithm considers one by one all possible groups of variables G € Q 
that may enter the solution in the present iteration. For each group of variables G the correlation of the 
associated predictors with the current residual r' = Y — X^-u^') is computed. Hence, instead of the 
whole matrix X, only the predictors of group G are required. 

In short, the storage of X can be avoided by recomputing the required predictors at each iteration. 
With such a programming there are almost no limits on the size of the regression matrix, and thus almost 
arbitrary dictionaries and grids may be used. 

To be more precise, let us have a look at the dictionary used in the following simulation study. The 
dictionary is composed of sinusoids with different frequencies /, different lengths I and different stalling 
points s. The index j is hence a triple index (/, I, s). Now instead of storing the whole vector Xj pq , we 
just use the triple index (/, l,s) to reconstruct the dictionary element tpj, whenever it is needed. Then 
further translation and multiplication with the appropriate steering vector yield the required predictor 
x j,p,q- Finally* we only have to store the steering matrix for all possible angles of arrival, which is a 
matrix of very moderate size C x P. With this knowledge, the predictor Xj iPi q is easily reconstructed at 
any iteration of the algorithm. 

In regression problems it is common to work with a normalized regression matrix, that is all columns 
have mean zero and standard deviation one. To avoid the computation of the normalization constants for 
every predictor, they may be computed just once and stored for later usage. Note that the normalization 
constants do not depend neither on the starting point of the dictionaiy element nor on the delay on the 
pathway but only on the frequency / and the length I of the dictionary element and the angle of arrival. 
Hence, the number of constants to store remains reasonable. 

The most time-consuming step in the algorithm is the scan of all predictors (or more precisely, of all 
relevant groups of predictors) in every iteration, to identify the one that is the most correlated with the 
current residual. It is noteworthy that this step can be speeded up by parallelization of the algorithm. 
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Indeed, there is no specific order to visit the predictors and the problem can be split in several independent 
subtasks. 

V. Stopping criterion 

The OMP-type algorithm requires an appropriate stopping criterion. A common approach to conceive 
a good stopping criterion is based on cross-validation ||T3l . Note that in our context, we do not have to 
deal with a single, but with two sparse vectors, namely a and /3 and their degrees of sparsity may not 
be equal. Hence, cross-validation would be computationally demanding. 

Here we propose a stopping criterion that on the one hand takes into account available prior information 
on the number of propagation pathways (and hence on the number of o-variables) and on the other hand 
has a data-driven component to determine the total number of non zero a Piq - and /3j -coefficients. 

A. Sparsity of a p ^-coefficients 

In the context of intercepted radar signals, one generally expects a small number of relevant propagation 
pathways. Indeed, we are mainly interested in the recovery of the direct path to determine the principal 
DOA. It is hence convenient to introduce an upper limit for the number of paths f/ max and to modify 
the OMP-type algorithm such that at every iteration the number of activated a Pi(? -coefncients is checked. 
If their number has achieved U max , then we stop activating further a Pig -coefficients and concentrate on 
activating only /^-coefficients. 

In simulations we observed that this procedure yields satisfactory results. The only important assump- 
tion for a good performance is that the direct path is not too weak compared to the indirect pathways, 
since the algorithm activates the components by their importance in the signal. 

Note that, if <7 max is smaller than the 'true' number of propagation pathways, then the observed signal 
Y cannot be completely explained. In other words, the solution obtained by the OMP-type algorithm is 
not the optimal solution of the minimization problem (TTOb . 

B. Control of the squared error 

The degree of sparsity in the /3-variables depends on the chosen dictionary. In general, very few prior 
information is available on the number J of basis elements used to decompose the radar signal. 

A simple but useful tool for model selection consists in studying the evolution of the squared error, 
that is the evolution of the square norm of the current residual L(w^') = \\Y — Xu}(*)|||, where 
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Fig. 1. Radar signal and its reconstruction. All signals, except (a), are on the first sensor, (e)-(h) represent the real part. 



denotes the solution at the t-th iteration. Obviously, the squared error L(w^) is monotone decreasing 
over the iterations. 

The OMP-type algorithm always adds the group of variables that are the most correlated with the 
current residual, that is the components that are likely to diminish the squared error the most. This 
explains why the algorithm tends to first activating all 'true' variables, and afterwards selecting variables 
that are not in the true model. Hence the problem of selecting the model order becomes the detection of 
the point when all 'true' variables are activated and the algorithm starts adding pure noise variables. 

In fact, activating a 'true' variable generally improves the squared error a lot, since the new variable 
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explains a considerable part of the observations. Whereas adding a variable which is not in the true model 
is less beneficial and the squared error hardly decreases. Consequently, first the squared error L(w®) 
decreases steeply until the point where all 'true' variables are activated. Then the graph of the squared 
error becomes very flat. This change of the slope yields a kink in the graph of the squared error L(w®). 
Thus, the kink indicates the number of 'true' variables. 

An efficient way to determine the kink consists in computing the convex hull of the graph. Then the 
kink is given by the endpoint of the longest linear piece of the convex hull 11221 . |[23l . 

We may hence proceed as follows. The OMP-type algorithm is run with some fixed U max and stopped 
after a large number of iterations. It is useful to store all intermediate solutions w^' and the associated 
square losses. Afterwards we determine the final solution as the intermediate solution where the 
squared error presents a kink. 

Clearly, the kink is especially pronounced when the signal-to-noise ratio is high. Conversely, when 
there is too much noise in the data, then it may be difficult to make out a kink. Besides, when the true 
model contains a relatively 'weak' variable, that is, its contribution to the signal is rather low, then the 
number of variables may be underestimated as the 'weak' variable can be confounded with noise. 

VI. Experimental Results 

A. Setting 

We illustrate the algorithm by an example. Let us consider a radar signal made of three subsequent 
sinusoids with different frequencies (100, 140, 110 MHz) and a gap between the two last sinus waves 
(sinusoids with starting points 0, 150, 300 ns and lengths 150, 100, 120ns), see Figure [TJa). The signal 
is detected by a uniform linear array (ULA) of 5 sensors separated by half a wavelength of the actual 
narrowband source signals. The sample frequency is F s = 1.28 GHz and the total duration of observation 
is 0.8 /xs. Thus, we have M = 1024 observation points and Y is a vector of length 5120. Furthermore, 
we consider three pathways (U = 3) with unknown angles of arrivals (9°, 13°, 38°) and delays (60, 95, 
120ns). The signal (without noise) received at the first sensor of the antenna is presented in Figure [TJb) 
and (c), where the superposition of the sinusoids results from the multiple propagation pathways. From 
these figures it is clear that the multiple paths have a strong impact on the signal and hence cannot be 
neglected in the analysis. The signal is corrupted by additive noise with a signal-to-noise ratio of 5 dB, 
see Figure Old). The signal-to-noise ratio is determined on the direct propagation pathway, defined as the 
10 log (energy of the signal on the first path/variance of the noise) as used in |[24l or 0. 
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We choose a Fourier dictionary containing sine and cosine functions with frequencies 90, 95,. . . , 150 
MHz, signal lengths 80, 85,. . . , 200 ns and starting points 0, 35, 65,. . . , 2555 ns, where 5 = l/F s 0.78 
ns. The size of the dictionary is thus J = 166400. For the angles and the delays we use the grids {1°, 
2°,. . . , 90°} and {0, 35, 65,. . . , 2555}, respectively Hence, w is a vector of dimension 3.83 • 10 9 . 

B. Signal reconstruction by the OMP-type algorithm 

In the simulations we set C/ max = 2, that is, at best we recover two of the three pathways and we 
stop the algorithm after 9 iterations. Figure [He)-(h) illustrate the first four iterations of the OMP-type 
algorithm. In the first iteration the algorithm selects a single sinusoid. Next, a second pathway is selected 
leading to two overlapping sinusoids. As now two a-variables are activated and [7 max = 2, the algorithm 
can only add further /3-variables, that is, further dictionary elements. Thus, in the third iteration another 
sinusoid is added which appears directly on the two pathways. Finally, another sinusoid is activated. 

We note that the graph in (h) is quite similar to the one in (c). Nevertheless, the reconstruction is not 
perfect, as we allowed the algorithm to find only two pathways, while indeed there are three. In this 
example the second pathway corresponding to the angle of arrival of 13° is missing. 

C. Model selection device 

As mentioned above, we stopped the algorithm after 9 iterations. Then we used the graph of the squared 
error to determine the number of variables in the model by the point where the curve exhibits a kink. 
The kink is determined by computing the convex hull of the graph. Figure |2] illustrates the evolution of 
the squared loss for a sample with a signal-to-noise ratio of 5 dB. The curve has a slight kink at 5, which 
is the appropriate number of variables in this case, as the signal is composed of three sinusoids and we 



TABLE I 

Empirical mean and standard deviation of the total number of selected variables of a- and 

/3-COEFFICIENTS WITH Umax = 2. 

SNR total nb nb of q's nb of /3's 

5 dB 5 (0) 2 (0) 3 (0) 

-5 dB 5 (0) 2 (0) 3 (0) 

-15 dB 5.1 (0.3) 2 (0) 3.1 (0.3) 
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Fig. 2. Graph of the squared error of a dataset with a signal-to-noise ratio of 5 dB. The mark indicates the kink, that is the 
selected number of model variables. 
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Fig. 3. Comparison of RMSE of the DOA and TOA for the elementary method and the new estimator for different signal-to-noise 
ratios applied on datasets of size 5 x 1280. The RMSE values are based on 50 repetitions per noise level. 
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Table U shows the empirical mean and standard deviation of the total number of selected variables, 
the number of a- and /3-coefficients for different signal-to-noise ratios over 50 repetitions. Obviously, 
in this example the model selection tool works very well, as the model selection device successfully 
distinguishes 'true' variables from noisy variables, even though the algorithm is prevented from selecting 
all 'true' variables. Hence, the introduction of an upper bound U max for the number of a-coefficients 
does not corrupt the model selection. 

D. Comparison with an elementary method 

A simple, elementary method for estimating the principal DOA in the radar setting consists in first 
estimating the TOA by thresholding and then using the point of observation at the TOA to fit the angle of 
arrival. More precisely, to detect the TOA by the elementary method, the absolute value of the observed 
signal at some time point t is compared to some threshold, here 3a, where a 2 is the variance of the 
noise distribution that is supposed to be known for the elementary method. The first point t where this 
happens is considered as the TOA. This detection method fails when the signal-to-noise ratio is too low, 
since then the signal is too weak compared to the noise such that the observed signal never exceeds the 
threshold. And when no TOA is detected, the elementary method does not provide an estimate of the 



We compare this elementary method to our OMP-type algorithm in the following setting with an 
FSK-modulated Barker sequence. We consider the 7-bit Barker sequence [1, 1, -1, -1, 1,-1, -1] with 
FSK-modulation. Every bit is represented by a sinusoid of length 80 ns with frequency 210 or 230 MHz. 
As there are four sign changes in the sequence, the transmitted signal is composed of four consecutive 
sinusoids of varying length. Furthermore, we consider two pathways (U = 2) with unknown angles of 
arrivals (6°and 42°) and delays of 10 and 40 ns. 



Empirical mean and standard deviation (in parentheses) of the parameter estimates obtained by the 



DOA. 



TABLE II 



ELEMENTARY METHOD. 



Elementary Method 



1st angle (DOA) 1st delay (TOA) 



true 



6° 



10 ns 



SNR 20 dB 5.99 (0.26) 



10.94 (0) 



SNR dB 



6.81 (8.43) 



22.31 (12.11) 
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Again, the sample frequency is F s = 1.28 GHz and we use a ULA of 5 sensors separated by half a 
wavelength of the actual narrowband source signals. The observation time is 1 /xs, such that Y is a vector 
of length 6400. The dictionary contains the sine and cosine functions with frequencies 200, 202,. . . , 240 
MHz, lengths 50, 55,. . . , 170 ns and starting points 0, 5, 25,. . . , 2555 ns, where 5 = 1/F S . The size of 
the dictionary is thus J = 268 800. We use the grids {1°, 2°,..., 90°} and {0, 5, 25,..., 2555} for the 
angles and the delays, respectively Hence, w is a vector of dimension 6.19 • 10 9 . Data are simulated with 
different levels of the signal-to-noise ratio. Again we choose U max = 2. 

Figure [3] compares the root mean square error (RMSE) of the DOA and TOA estimates obtained by 
both methods based on 50 simulated datasets for each noise level. Compared to the elementary procedure 
the OMP-type algorithm achieves a gain of around 20 dB. This gain is mainly due to the fact that the 
algorithm takes into account all observations and not only a single point. For low signal-to-noise ratios 
the elementary method completely fails. 

E. Parameter Estimates 

It is important to note that in contrast to the elementary method the OMP-type algorithm provides 
estimates not only of the DOA and the TOA, but of all the other parameters including the waveform. 
Indeed, in the setting described above our algorithm identifies several consecutive sinusoids. Table |TT] 
presents the empirical mean and standard deviation of the estimates obtained by the elementary method 
and Table [III] gives the results for the OMP-type method, namely for the estimates of the angles of arrival, 
the associated delays, the frequencies of the four detected sinusoids, their lengths and their starting points. 
Estimates are especially accurate for the angles of arrival and the frequencies, whereas the estimates of 
the lengths of the sinusoids and their starting points are less accurate. There is a clear tendency to 
overestimate the length, especially for the shortest sinusoid. Indeed, for high noise levels it is not clear if 
the OMP-type algorithm detects the shortest (and less energetic) sinusoid. The fourth sinusoid generally 
has not even the correct frequency, so it is possible that the algorithm adds any non relevant predictor, 
because the shortest sinusoid is too weak for detection. 

We observe that the reconstructed radar signal s has not exactly the form of a modulated signal due 
to the overlapping sinusoids. It is hence up to the user to make such an interpretation of the signal. 
Nevertheless, we emphasize that the algorithm works for any type of modulation (FSK or PSK) without 
prior knowledge on the modulation and it provides a good idea of the waveform. 
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VII. Conclusion 

The algorithm proposed in this paper is to the best of our knowledge the first method addressing the 
problem of estimating both the DOA and the unknown waveform of a transmitted radar signal in the 
presence of multiple propagation pathways. The new method is in the spirit of OMP extended to more 
sophisticated sparsity structures. The scaling of the OMP-type algorithm has been considered, which is 
necessary for unbiased estimation. A simulation study illustrates the good performance of the new method 
and exhibits a considerable improvement with respect to some elementary method. 

We would like to emphasize the general character of our method, which is of much interest for the 
intercepted radar signal setting. First, the method proposed here is not restricted to a specific array 
geometry. Second and more importantly, the approach can be adapted to a large variety of signals such 
as sinusoids, chirps, composed signals etc. by using a dictionary where a sparse representation of the 
signal is possible. 
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TABLE III 

Empirical mean and standard deviation (in parentheses) of the parameter estimates obtained by the 

omp-type algorithm. 



OMP-type algorithm 





1st angle 


1st delay 


2nd angle 


2nd delay 


true 


6° 


10 ns 


42° 


40 ns 


20 dB 


6.00 (0) 


10.94 (0) 


42.0 (0) 


40.6 (0) 


dB 


5.94 (0.24) 


10.5 (1.34) 


41.9 (0.27) 


40.2 (1.26) 




1st frequ. 


2nd frequ. 


3rd frequ. 


4th frequ. 


true 


210 MHz 


230 MHz 


210 MHz 


230 MHz 


20 dB 


230.0 (0) 


210.0 (0) 


230.0 (0) 


210.0 (0) 


dB 


230.0 (0) 


210.0 (0) 


230.1 (0.40) 


210.0 (0) 




1st length 


2nd length 


3rd length 


4th length 


true 


160 ns 


160 ns 


80 ns 


160 ns 


20 dB 


160.0 (0) 


160.6 (1.64) 


162.4 (12.1) 


170.0 (0) 


dB 


163.1 (4.62) 


165.5 (4.20) 


162.4 (15.1) 


170.0 (0) 




1st start 


2nd start 


3rd start 


4th start 


true 


ns 


160 ns 


320 ns 


400 ns 


20 dB 


0(0) 


158.8 (1.54) 


265.3 (12.9) 


389.1 (0) 


dB 


0(0) 


153.1 (5.37) 


271.0 (19.7) 


389.5 (1.34) 
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